Python Data Science Essentials by Boschetti Alberto & Massaron Luca
Author: Boschetti, Alberto & Massaron, Luca
Language: eng
Format: azw3
Publisher: Packt Publishing
Published: 2015-04-29T16:00:00+00:00
If the expected fraction of outliers is very small, nu will be small too, and the SVM algorithm will try to fit the contour of the data points closely. On the other hand, if the fraction is high, so will be the parameter, forcing a smoother boundary around the inliers' distribution.
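To get a feel for the parameter, here is a minimal sketch that fits OneClassSVM with a few values of nu and reports the share of training points flagged as outliers. It assumes synthetic Gaussian data as a stand-in (not the Boston dataset), and the gamma value and the list of nu values are arbitrary choices made only for illustration. In scikit-learn, nu is documented as an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors, so the flagged share should grow roughly in step with nu.

```python
import numpy as np
from sklearn import svm

# Synthetic stand-in data (an assumption for this sketch, not the Boston data)
rng = np.random.RandomState(0)
X = rng.normal(size=(500, 2))

flagged = {}
for nu in (0.02, 0.1, 0.5):
    # gamma=0.5 is an arbitrary illustrative value
    model = svm.OneClassSVM(kernel="rbf", gamma=0.5, nu=nu)
    model.fit(X)
    # predict returns -1 for outliers and +1 for inliers
    flagged[nu] = float(np.mean(model.predict(X) == -1))
    sv_share = len(model.support_) / len(X)
    print("nu=%.2f flagged=%.3f support vectors=%.3f" % (nu, flagged[nu], sv_share))
```

The exact shares depend on the data and on the solver tolerance, so treat the printed numbers as indicative rather than exact.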
Let's immediately observe the performance of this algorithm on the problem that we faced before on the Boston house price dataset:
In:
import numpy as np
from sklearn.decomposition import PCA
from sklearn import preprocessing
from sklearn import svm
from matplotlib import pyplot as plt

# Assumes `boston` was loaded earlier in the chapter (for example with
# sklearn.datasets.load_boston, which recent scikit-learn releases no longer ship)

# Normalized data relative to continuous variables (column 3 is skipped)
continuous_variables = [n for n in range(np.shape(boston.data)[1]) if n!=3]
normalized_data = preprocessing.StandardScaler().fit_transform(boston.data[:,continuous_variables])

# Just for visualization purposes pick the first 5 PCA components
pca = PCA(n_components=5)
Zscore_components = pca.fit_transform(normalized_data)
vtot = 'PCA Variance explained ' + str(round(np.sum(pca.explained_variance_ratio_),3))

# OneClassSVM fitting and estimates
outliers_fraction = 0.02
nu_estimate = 0.95 * outliers_fraction + 0.05
# degree is ignored by the rbf kernel; it is kept here as in the original listing
machine_learning = svm.OneClassSVM(kernel="rbf", gamma=1.0/len(normalized_data), degree=3, nu=nu_estimate)
machine_learning.fit(normalized_data)
detection = machine_learning.predict(normalized_data)
outliers = np.where(detection==-1)
regular = np.where(detection==1)

# Draw the distribution and the detected outliers,
# plotting component 1 against components 2 to 5 in turn
for r in range(1,5):
    a = plt.plot(Zscore_components[regular,0],Zscore_components[regular,r], 'x', markersize=2, color='blue', alpha=0.6, label='inliers')
    b = plt.plot(Zscore_components[outliers,0],Zscore_components[outliers,r], 'o', markersize=6,color='red', alpha=0.8, label='outliers')
    plt.xlabel('Component 1 ('+str(round(pca.explained_variance_ratio_[0],3))+')')
    plt.ylabel('Component '+str(r+1)+'('+str(round(pca.explained_variance_ratio_[r],3))+')')
    plt.xlim([-7,7])
    plt.ylim([-6,6])
    plt.legend((a[0],b[0]),('inliers','outliers'),numpoints=1,loc='best')
    plt.title(vtot)
    plt.show()
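As a quick check of the numbers used in the listing, the following sketch works through the nu_estimate arithmetic and shows how np.where separates the -1 and +1 labels. The detection array here is a hypothetical hand-written vector standing in for the output of machine_learning.predict, so the indices it yields are illustrative only.

```python
import numpy as np

# Same formula as in the listing: 0.95 * 0.02 + 0.05 = 0.069
outliers_fraction = 0.02
nu_estimate = 0.95 * outliers_fraction + 0.05
print(round(nu_estimate, 3))  # 0.069

# Hypothetical predictions standing in for machine_learning.predict(...):
# -1 marks an outlier, +1 marks an inlier
detection = np.array([1, 1, -1, 1, -1, 1])

# np.where returns a tuple holding one index array per dimension
outliers = np.where(detection == -1)
regular = np.where(detection == 1)
print(outliers[0])  # [2 4]
print(regular[0])   # [0 1 3 5]
```

With outliers_fraction set to 0.02, the formula yields a nu of 0.069, so the model is allowed to flag somewhat more than the nominal 2 percent of the training points.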
